An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning
DQN (Deep Q-Network) is a method to perform Q-learning for reinforcement
learning using deep neural networks. DQNs require a large buffer and batch
processing for an experience replay and rely on a backpropagation based
iterative optimization, making them difficult to implement on
resource-limited edge devices. In this paper, we propose a lightweight
on-device reinforcement learning approach for low-cost FPGA devices. It
exploits a recently proposed neural-network based on-device learning approach
that does not rely on backpropagation but instead uses a training algorithm
based on OS-ELM (Online Sequential Extreme Learning Machine). In addition, we
propose a combination of L2 regularization and spectral normalization for the
on-device reinforcement learning so that the output values of the neural network
stay within a bounded range and the reinforcement learning becomes stable.
The proposed reinforcement learning approach is designed for the PYNQ-Z1 board as a
low-cost FPGA platform. The evaluation results using OpenAI Gym demonstrate
that the proposed algorithm and its FPGA implementation complete a CartPole-v0
task 29.77x and 89.40x faster, respectively, than a conventional DQN-based
approach when the number of hidden-layer nodes is 64.
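The backpropagation-free training mentioned in this abstract can be illustrated with a minimal OS-ELM sketch. It is not the paper's implementation: it assumes one sample per update (so the recursive least-squares form needs no matrix inversion), a tanh hidden layer, and an L2 term folded into the initial matrix `P = I / l2`. The class name `OSELM` and all parameter names are hypothetical, and spectral normalization and the Q-learning loop are left out.

```python
import numpy as np


class OSELM:
    """Illustrative single-hidden-layer OS-ELM (names are hypothetical)."""

    def __init__(self, n_in, n_hidden, n_out, l2=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Input weights and biases are random and never trained.
        self.alpha = rng.uniform(-1.0, 1.0, size=(n_in, n_hidden))
        self.bias = rng.uniform(-1.0, 1.0, size=n_hidden)
        # Output weights start at zero; P starts as (l2 * I)^-1.
        self.beta = np.zeros((n_hidden, n_out))
        self.P = np.eye(n_hidden) / l2

    def hidden(self, x):
        # Hidden-layer activations, one row per input sample.
        return np.tanh(np.atleast_2d(x) @ self.alpha + self.bias)

    def predict(self, x):
        return self.hidden(x) @ self.beta

    def train_one(self, x, t):
        # Recursive least-squares update for a single sample.
        h = self.hidden(x)                # shape (1, n_hidden)
        Ph = self.P @ h.T                 # shape (n_hidden, 1)
        denom = 1.0 + (h @ Ph).item()
        # P is symmetric, so P h^T h P equals Ph Ph^T.
        self.P -= (Ph @ Ph.T) / denom
        err = np.atleast_2d(t) - h @ self.beta
        self.beta += self.P @ h.T @ err
```

Under these assumptions, feeding samples one at a time yields the same output weights as a batch ridge-regression fit over all of them, which is why no replay buffer or iterative optimizer is needed for this part of the training.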
An On-Device Federated Learning Approach for Cooperative Anomaly Detection
Most edge AI focuses on prediction tasks on resource-limited edge devices
while the training is done at server machines. However, retraining or
customizing a model is required at edge devices as the model becomes
outdated due to environmental changes over time. To follow such concept
drift, a neural-network based on-device learning approach has recently been proposed,
so that edge devices train on incoming data at runtime to update their model. In
this case, since training is done at distributed edge devices, the issue is
that only a limited amount of training data can be used for each edge device.
To address this issue, one approach is cooperative learning or federated
learning, where edge devices exchange their trained results and update their
model by using those collected from the other devices. In this paper, as an
on-device learning algorithm, we focus on OS-ELM (Online Sequential Extreme
Learning Machine) to sequentially train a model based on recent samples and
combine it with an autoencoder for anomaly detection. We extend it to
on-device federated learning so that edge devices can exchange their trained
results and update their model by using those collected from the other edge
devices. This cooperative model update is one-shot, though it can be applied
repeatedly to synchronize their models. Our approach is evaluated with anomaly
detection tasks generated from a driving dataset of cars, a human activity
dataset, and the MNIST dataset. The results demonstrate that the proposed on-device
federated learning can produce a merged model by integrating trained results
from multiple edge devices as accurately as traditional backpropagation based
neural networks and a traditional federated learning approach with lower
computation or communication cost.
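One way such a one-shot merge could work is sketched below. It is not necessarily the paper's exact procedure: it assumes every device uses the same random hidden-layer weights, that each device shares the sums `H^T H` and `H^T T` as its "trained results", and that the merged output weights come from a ridge-regression solve. The function names `hidden`, `local_stats`, and `merge_and_solve` are hypothetical. For an autoencoder, the target `T` is the input itself.

```python
import numpy as np


def hidden(X, W, b):
    # Shared random hidden layer (assumed identical on all devices).
    return np.tanh(X @ W + b)


def local_stats(H, T):
    # Per-device sufficient statistics: Gram matrix and cross term.
    return H.T @ H, H.T @ T


def merge_and_solve(stats, l2=1e-3):
    # Sum the statistics from all devices, then solve once for beta.
    U = sum(u for u, _ in stats)
    V = sum(v for _, v in stats)
    return np.linalg.solve(U + l2 * np.eye(U.shape[0]), V)
```

Because the statistics are plain sums over samples, adding them gives the same output weights as training on the pooled data, and each device sends only two small matrices rather than its raw samples.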
On-Device Learning: A Neural Network Based Field-Trainable Edge AI
The accuracy of real-world edge AI applications is often affected by
various environmental factors, such as noise, location/calibration of sensors,
and time-related changes. This article introduces a neural network based
on-device learning approach to address this issue without going deep. Our
approach differs from de facto backpropagation-based training and is
tailored for low-end edge devices. This article introduces its algorithm and
implementation on a wireless sensor node consisting of a Raspberry Pi Pico and
a low-power wireless module. Experiments using vibration patterns of rotating
machines demonstrate that retraining by the on-device learning significantly
improves anomaly detection accuracy in a noisy environment while saving
computation and communication costs for low power.
Comment: Power values are updated with a new wireless module from v